## Implementing the Design

#### Introduction

This lab continues from the previous lab. You will perform static timing analysis. You will implement the design with the default settings and generate a bitstream. Then you will open a hardware session and program the FPGA. You will use the on-board UART of the Nexys4 DDR, the Basys3, or the Nexys Video board to validate your design.

## **Objectives**

After completing this lab, you will be able to:

- Implement the design
- Generate various reports and analyze the results
- Run static timing analysis
- Generate bitstream and verify the functionality in hardware

#### **Procedure**

This lab is broken into steps that consist of general overview statements providing information on the detailed instructions that follow. Follow these detailed instructions to progress through the lab.

#### **General Flow**



In the instructions below:

{sources} refers to: C:\xup\fpga\_flow\2016\_2\_artix7\_sources

{labs} refers to: C:\xup\fpga\_flow\2016\_2\_artix7\_labs

Board support for the Basys3, Nexys4 DDR, and Nexys Video is not included in Vivado 2016.2 by default. The relevant zip files need to be extracted and saved to: {Vivado installation}\data\boards\board\_files\.

These files can be downloaded either from the Digilent, Inc. webpage (<https://reference.digilentinc.com/vivado/boardfiles2015>) or the XUP webpage (<http://www.xilinx.com/support/university/vivado/vivado-workshops/Vivado-fpga-design-flow.html>), where this material is also hosted.



## **Open a Vivado Project using IDE**

Step 1

- 1-1. Launch Vivado and open the lab2 project. Save the project as lab3 in the {labs} directory, making sure that the create subdirectory option is selected. Set the flatten\_hierarchy setting to rebuilt. Create a new synthesis run named synth\_2.
- **1-1-1.** Start Vivado if necessary and open either the lab2 project (lab2.xpr) you created in the previous lab or the lab2 project in the labsolution directory using the **Open Project** link in the Getting Started page.
- **1-1-2.** Select **File > Save Project As...** to open the *Save Project As* dialog box. Enter **lab3** as the project name. Make sure that the *Create Project Subdirectory* option is checked and that the project directory path is **{labs}**, then click **OK**.
- **1-1-3.** Click **Synthesis Settings** in the *Flow Navigator* pane.
- **1-1-4.** Make sure that *flatten\_hierarchy* is set to **rebuilt**. This setting lets the synthesis tool flatten the hierarchy, perform synthesis, and then rebuild the hierarchy based on the original RTL. The rebuilt hierarchy is more useful for design analysis because many logical references are maintained.



Figure 1. Setting hierarchy to rebuilt

- **1-1-5.** Click **OK**.

A Create New Run dialog box will appear asking whether a new run should be created. Click **Yes** and then **OK** to create the new run with the name **synth\_2**.



- 1-2. Synthesize the design. Generate the timing summary and analyze the design.
- **1-2-1.** Click **Run Synthesis** under the *Synthesis* tasks of the *Flow Navigator* pane.

The synthesis process will be run on uart\_led.v and all its hierarchical files. When the process is completed, a *Synthesis Completed* dialog box with three options will be displayed.

- **1-2-2.** Select the *Open Synthesized Design* option and click **OK** as we want to look at the synthesis output.
- **1-2-3.** Click on **Report Timing Summary** under the *Synthesized Design* tasks of the *Flow Navigator* pane.
- **1-2-4.** Leave all the settings unchanged, and click **OK** to generate a default timing report, timing\_1.



Figure 2. Timing report for the Nexys4 DDR



Figure 2. Timing report for the Basys3



Figure 2. Timing report for the Nexys Video

- **1-2-5.** Click on the link beside the Worst Negative Slack (WNS) and see the 8 failing paths.
- **1-2-6.** Double-click Path 23 to see a detailed view of the path. The path report shows four sections: (i) Summary, (ii) Source Clock Path, (iii) Data Path, and (iv) Destination Clock Path.



**1-2-7.** Select Path 23 in the timing summary panel, or the Path summary view, right-click, and select **Schematic**.

The schematic for the output data path will be displayed.



Figure 3. The output data path

**1-2-8.** To see how the Source Clock Path is made up in schematic form, double-click on the left end of the C pin of the FDRE in the schematic.

This will show the net between the BUFG and the C port of the FDRE.

**1-2-9.** Similarly, double-click on the left end of the BUFG to see the path between the IBUF and the BUFG.



Figure 4. Source to clock port of the FDRE

**1-2-10.** Finally, double-click on the input pin of IBUF to see the path between the clock input pin and the IBUF.





Figure 5. The schematic view of the source clock path

This corresponds to the Source Clock Path in the timing report.

**Source Clock Path**

| Delay Type                | Incr (ns)  | Path (ns) | Location | Netlist Resource(s)       |
|---------------------------|------------|-----------|----------|---------------------------|
| (clock clk_pin rise edge) | (r) 10.000 | 10.000    |          |                           |
|                           | (r) 0.000  | 10.000    | Site: E3 | clk_pin                   |
| net (fo=0)                | 0.000      | 10.000    |          | clk_pin                   |
|                           |            |           | Site: E3 | clk_pin_IBUF_inst/I       |
| IBUF (Prop_ibuf_I_O)      | (r) 1.482  | 11.482    | Site: E3 | clk_pin_IBUF_inst/O       |
| net (fo=1, unplaced)      | 0.803      | 12.285    |          | clk_pin_IBUF              |
|                           |            |           |          | clk_pin_IBUF_BUFG_inst/I  |
| BUFG (Prop_bufg_I_O)      | (r) 0.096  | 12.381    |          | clk_pin_IBUF_BUFG_inst/O  |
| net (fo=48, unplaced)     | 0.584      | 12.965    |          | led_ctl_i0/CLK            |
| FDRE                      |            |           |          | led_ctl_i0/led_o_reg[1]/C |

Figure 6. The source clock path for the Nexys4 DDR



Figure 6. The source clock path for the Basys3



Figure 6. The source clock path for the Nexys Video

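Because the virtual clock period (12 ns) differs from the clk\_pin period (10 ns), the tightest setup check launches data on the clk\_pin edge at 10 ns and captures it on the virtual clock edge at 12 ns. The source clock path therefore starts at 10.000 ns, so the data arrival time includes one full period of the clk\_pin clock source.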
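The 10.000 ns launch edge and the 12.000 ns capture edge seen in the path reports can be reproduced with a short calculation. The sketch below is a simplified model, not Vivado's actual algorithm: it assumes both clocks rise at time zero with no phase shift, lists the rising edges over the common period, and picks the launch/capture pair with the smallest positive separation. The function name `tightest_setup_edges` is made up for this illustration.

```python
from math import gcd


def tightest_setup_edges(launch_period, capture_period):
    """Return the (launch, capture) rising-edge pair with the smallest
    positive separation. Periods are integers in nanoseconds and both
    clocks are assumed to rise at t = 0 (no phase shift)."""
    common = launch_period * capture_period // gcd(launch_period, capture_period)
    launch_edges = range(0, common, launch_period)
    # Capture edges extend one common period further so that every launch
    # edge has at least one later capture edge to pair with.
    capture_edges = range(0, 2 * common + 1, capture_period)
    best = None
    for launch in launch_edges:
        for capture in capture_edges:
            if capture <= launch:
                continue
            if best is None or capture - launch < best[1] - best[0]:
                best = (launch, capture)
    return best


# clk_pin = 10 ns, virtual_clock = 12 ns (the constraints used so far)
print(tightest_setup_edges(10, 12))
# Both clocks at 10 ns (the constraint change made in step 1-3)
print(tightest_setup_edges(10, 10))
```

With periods of 10 ns and 12 ns the pair is (10, 12), leaving only a 2 ns window for the output path. With both periods at 10 ns the pair becomes (0, 10), which matches the 0.000 ns and 10.000 ns clock edges in the post-implementation reports later in this lab.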





Figure 7. Worst failing path for the Nexys4 DDR

**Source Clock Path**

| Delay Type                | Incr (ns)  | Path (ns) | Location | Netlist Resource(s)       |
|---------------------------|------------|-----------|----------|---------------------------|
| (clock clk_pin rise edge) | (r) 10.000 | 10.000    |          |                           |
|                           | (r) 0.000  | 10.000    | Site: W5 | clk_pin                   |
| net (fo=0)                | 0.000      | 10.000    |          | clk_pin                   |
|                           |            |           | Site: W5 | clk_pin_IBUF_inst/I       |
| IBUF (Prop_ibuf_I_O)      | (r) 1.458  | 11.458    | Site: W5 | clk_pin_IBUF_inst/O       |
| net (fo=1, unplaced)      | 0.800      | 12.258    |          | clk_pin_IBUF              |
|                           |            |           |          | clk_pin_IBUF_BUFG_inst/I  |
| BUFG (Prop_bufg_I_O)      | (r) 0.096  | 12.354    |          | clk_pin_IBUF_BUFG_inst/O  |
| net (fo=48, unplaced)     | 0.584      | 12.938    |          | led_ctl_i0/CLK            |
| FDRE                      |            |           |          | led_ctl_i0/led_o_reg[1]/C |

**Data Path**

| Delay Type           | Incr (ns) | Path (ns) | Location  | Netlist Resource(s)       |
|----------------------|-----------|-----------|-----------|---------------------------|
| FDRE (Prop_fdre_C_Q) | (r) 0.456 | 13.394    |           | led_ctl_i0/led_o_reg[1]/Q |
| net (fo=1, unplaced) | 0.800     | 14.194    |           | led_pins_OBUF[1]          |
|                      |           |           | Site: E19 | led_pins_OBUF[1]_inst/I   |
| OBUF (Prop_obuf_I_O) | (r) 3.705 | 17.899    | Site: E19 | led_pins_OBUF[1]_inst/O   |
| net (fo=0)           | 0.000     | 17.899    |           | led_pins[1]               |
|                      |           |           | Site: E19 | led_pins[1]               |
| Arrival Time         |           | 17.899    |           |                           |

**Destination Clock Path**

| Delay Type                      | Incr (ns)  | Path (ns) | Location | Netlist Resource(s) |
|---------------------------------|------------|-----------|----------|---------------------|
| (clock virtual_clock rise edge) | (r) 12.000 | 12.000    |          |                     |
| ideal clock network latency     | 0.000      | 12.000    |          |                     |
| clock pessimism                 | 0.000      | 12.000    |          |                     |
| clock uncertainty               | -0.025     | 11.975    |          |                     |
| output delay                    | -0.000     | 11.975    |          |                     |
| Required Time                   |            | 11.975    |          |                     |

Figure 7. Worst failing path for the Basys3
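The arithmetic in a path report can be checked by hand. As a minimal sketch, the snippet below adds up the Incr (ns) column of the Basys3 report above to reproduce the arrival time and the required time, then subtracts them to get the setup slack. The slack value is not listed in the table itself; it is computed here from the listed figures, and the variable and function names are illustrative only.

```python
from decimal import Decimal

# Incr (ns) values copied from the Basys3 post-synthesis report.
source_clock_path = ["10.000", "0.000", "0.000", "1.458", "0.800", "0.096", "0.584"]
data_path = ["0.456", "0.800", "3.705", "0.000"]
# Capture edge of virtual_clock, then latency, pessimism, uncertainty, output delay.
destination_clock_path = ["12.000", "0.000", "0.000", "-0.025", "-0.000"]


def total(increments):
    """Sum a list of decimal strings exactly (no binary rounding error)."""
    return sum((Decimal(x) for x in increments), Decimal("0"))


arrival_time = total(source_clock_path) + total(data_path)
required_time = total(destination_clock_path)
slack = required_time - arrival_time

print("arrival ", arrival_time)
print("required", required_time)
print("slack   ", slack)
```

The arrival and required times match the 17.899 ns and 11.975 ns shown in the report, and the resulting slack of -5.924 ns is negative: the data arrives after it is required, so the path fails setup.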





Figure 7. Worst failing path for the Nexys Video

- 1-3. Change the design constraint to constrain the virtual clock period to 10 ns. Re-synthesize the design and analyze the results.
- **1-3-1.** Click **Edit Timing Constraints** under the *Synthesized Design*.

The Timing Constraints GUI will appear, showing that the design has two create clock constraints, four input constraints, and one output constraint. It also shows the constraints in text form in the All Constraints section.



Figure 8. Timing Constraints showing 12 ns Virtual Clock period defined

- **1-3-2.** Click in the Period cell of the virtual\_clock and change the period from 12 to 10.





- **1-3-3.** Click **Apply**.

Note that since the timing constraint has changed, a warning message in the console pane is displayed to rerun the report.

Report is out of date because timing data has been modified. Rerun

- **1-3-4.** Click **Rerun**.

Notice that the setup timing violations are gone. However, there are still 2 failing paths for hold.



Figure 9. Setup timing met for the Nexys4 DDR



Figure 9. Setup timing met for the Basys3



Figure 9. Setup timing met for the Nexys Video

- **1-3-5.** Click on the WHS link to see the paths.
- **1-3-6.** Double-click on the first path to see how its timing is composed. Notice that the clock path delay does not include the entire clock period.
- **1-3-7.** Select **File > Save Constraints...**
- **1-3-8.** Click **OK** and then click **Yes** to save the synthesized design.

Notice that the Synthesis Out-of-Date status is displayed on the top-right corner.



## Implement the Design

Step 2

- 2-1. Run the implementation after saving the synthesis run. Perform the timing analysis.
- **2-1-1.** In the Design Runs tab, right-click **synth\_2** and select **Reset Runs**. Make sure the generated files are deleted. Click **Reset**.
- **2-1-2.** Click the **Close Design** link in the status bar. If prompted, do not save anything.
- **2-1-3.** Click **Run Implementation** in the *Flow Navigator* pane.
- **2-1-4.** Click **OK** when prompted to run synthesis before running the implementation process.

When the implementation is completed, a dialog box will appear with three options.

- **2-1-5.** Select the *Open Implemented Design* option and click **OK**.
- 2-2. View the amount of FPGA resources consumed by the design using Report Utilization.
- **2-2-1.** In the *Flow Navigator* pane, select **Open Implemented Design > Report Utilization**.

The Report Utilization dialog box opens.

- **2-2-2.** Click **OK**.

The utilization report is displayed at the bottom of the Vivado IDE. You can select any of the resources on the left to view its corresponding utilization.

- **2-2-3.** Select **Slice LUTs** to view how much of the resource is used and which modules consume it.



Figure 10. Resource utilization for the Nexys4 DDR





Figure 10. Resource utilization for the Basys3



Figure 10. Resource utilization for the Nexys Video

- 2-3. Generate a timing summary report.
- **2-3-1.** In the Flow Navigator, under Implementation > Implemented Design, click **Report Timing Summary**.

The Report Timing Summary dialog box opens.

- **2-3-2.** Leave all the settings unchanged and click **OK** to generate the report.



Figure 11. The timing summary report showing timing violations for the Nexys4 DDR



Figure 11. The timing summary report showing timing violations for the Basys3





Figure 11. The timing summary report showing timing violations for the Nexys Video

- **2-3-3.** Click on the WNS link to see a detailed report to determine the failing path entries.
- **2-3-4.** Double-click on the first failing path to see why it is failing.



Figure 12. First failing path delays for the Nexys4 DDR

**Source Clock Path**

| Delay Type                | Incr (ns) | Path (ns) | Location            | Netlist Resource(s)       |
|---------------------------|-----------|-----------|---------------------|---------------------------|
| (clock clk_pin rise edge) | (r) 0.000 | 0.000     |                     |                           |
|                           | (r) 0.000 | 0.000     | Site: W5            | clk_pin                   |
| net (fo=0)                | 0.000     | 0.000     |                     | clk_pin                   |
|                           |           |           | Site: W5            | clk_pin_IBUF_inst/I       |
| IBUF (Prop_ibuf_I_O)      | (r) 1.458 | 1.458     | Site: W5            | clk_pin_IBUF_inst/O       |
| net (fo=1, routed)        | 1.967     | 3.425     |                     | clk_pin_IBUF              |
|                           |           |           | Site: BUFGCTRL_X0Y0 | clk_pin_IBUF_BUFG_inst/I  |
| BUFG (Prop_bufg_I_O)      | (r) 0.096 | 3.521     | Site: BUFGCTRL_X0Y0 | clk_pin_IBUF_BUFG_inst/O  |
| net (fo=48, routed)       | 1.622     | 5.143     |                     | led_ctl_i0/CLK            |
| FDRE                      |           |           | Site: SLICE_X5Y20   | led_ctl_i0/led_o_reg[5]/C |

**Data Path**

| Delay Type           | Incr (ns) | Path (ns) | Location          | Netlist Resource(s)       |
|----------------------|-----------|-----------|-------------------|---------------------------|
| FDRE (Prop_fdre_C_Q) | (r) 0.419 | 5.562     | Site: SLICE_X5Y20 | led_ctl_i0/led_o_reg[5]/Q |
| net (fo=1, routed)   | 2.119     | 7.681     |                   | led_pins_OBUF[5]          |
|                      |           |           | Site: U15         | led_pins_OBUF[5]_inst/I   |
| OBUF (Prop_obuf_I_O) | (r) 3.689 | 11.371    | Site: U15         | led_pins_OBUF[5]_inst/O   |
| net (fo=0)           | 0.000     | 11.371    |                   | led_pins[5]               |
|                      |           |           | Site: U15         | led_pins[5]               |
| Arrival Time         |           | 11.371    |                   |                           |

**Destination Clock Path**

| Delay Type                      | Incr (ns)  | Path (ns) | Location | Netlist Resource(s) |
|---------------------------------|------------|-----------|----------|---------------------|
| (clock virtual_clock rise edge) | (r) 10.000 | 10.000    |          |                     |
| ideal clock network latency     | 0.000      | 10.000    |          |                     |
| clock pessimism                 | 0.000      | 10.000    |          |                     |
| clock uncertainty               | -0.025     | 9.975     |          |                     |
| output delay                    | -0.000     | 9.975     |          |                     |
| Required Time                   |            | 9.975     |          |                     |

Figure 12. First failing path delays for the Basys3

**Source Clock Path**

| Delay Type                | Incr (ns) | Path (ns) | Location            | Netlist Resource(s)       |
|---------------------------|-----------|-----------|---------------------|---------------------------|
| (clock clk_pin rise edge) | (r) 0.000 | 0.000     |                     |                           |
|                           | (r) 0.000 | 0.000     | Site: R4            | clk_pin                   |
| net (fo=0)                | 0.000     | 0.000     |                     | clk_pin                   |
|                           |           |           | Site: R4            | clk_pin_IBUF_inst/I       |
| IBUF (Prop_ibuf_I_O)      | (r) 1.475 | 1.475     | Site: R4            | clk_pin_IBUF_inst/O       |
| net (fo=1, routed)        | 2.114     | 3.589     |                     | clk_pin_IBUF              |
|                           |           |           | Site: BUFGCTRL_X0Y0 | clk_pin_IBUF_BUFG_inst/I  |
| BUFG (Prop_bufg_I_O)      | (r) 0.096 | 3.685     | Site: BUFGCTRL_X0Y0 | clk_pin_IBUF_BUFG_inst/O  |
| net (fo=48, routed)       | 2.017     | 5.702     |                     | led_ctl_i0/CLK            |
| FDRE                      |           |           | Site: SLICE_X1Y83   | led_ctl_i0/led_o_reg[3]/C |

**Data Path**

| Delay Type           | Incr (ns) | Path (ns) | Location          | Netlist Resource(s)       |
|----------------------|-----------|-----------|-------------------|---------------------------|
| FDRE (Prop_fdre_C_Q) | (r) 0.456 | 6.158     | Site: SLICE_X1Y83 | led_ctl_i0/led_o_reg[3]/Q |
| net (fo=1, routed)   | 2.156     | 8.314     |                   | led_pins_OBUF[3]          |
|                      |           |           | Site: U16         | led_pins_OBUF[3]_inst/I   |
| OBUF (Prop_obuf_I_O) | (r) 2.917 | 11.231    | Site: U16         | led_pins_OBUF[3]_inst/O   |
| net (fo=0)           | 0.000     | 11.231    |                   | led_pins[3]               |
|                      |           |           | Site: U16         | led_pins[3]               |
| Arrival Time         |           | 11.231    |                   |                           |

**Destination Clock Path**

| Delay Type                      | Incr (ns)  | Path (ns) | Location | Netlist Resource(s) |
|---------------------------------|------------|-----------|----------|---------------------|
| (clock virtual_clock rise edge) | (r) 10.000 | 10.000    |          |                     |
| ideal clock network latency     | 0.000      | 10.000    |          |                     |
| clock pessimism                 | 0.000      | 10.000    |          |                     |
| clock uncertainty               | -0.025     | 9.975     |          |                     |
| output delay                    | -0.000     | 9.975     |          |                     |
| Required Time                   |            | 9.975     |          |                     |

Figure 12. First failing path delays for the Nexys Video

Compared to the delays in the synthesis report, the net delays are now actual routed delays rather than estimates. The data arrives later than the required time set by the destination clock path, giving a negative slack (a timing violation). For the Nexys4 DDR, the data path delay is 11.534 ns, the destination clock path is 9.975 ns, and the slack is -1.559 ns.

The figures are 11.381 ns, 9.975 ns, and -1.406 ns respectively for the Basys3.

At this point we can ignore this violation, as an LED change that is late by a few nanoseconds is not observable by the human eye. Alternatively, we can set the output delay to -2 ns so that timing is met.
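To see why an output delay of -2 ns is enough, note that the report subtracts the output delay from the capture edge when computing the required time (the "output delay" row of the Destination Clock Path section). The sketch below applies that to the figures quoted above. It assumes the arrival times stay the same, since only the constraint is edited and the design is not re-implemented at this point; the function name `setup_slack` is illustrative.

```python
from decimal import Decimal


def setup_slack(arrival, capture_edge, uncertainty, output_delay):
    """Setup slack (ns) for an output path: required time minus arrival time."""
    required = capture_edge - uncertainty - output_delay
    return required - arrival


capture_edge = Decimal("10.000")   # virtual_clock period after step 1-3
uncertainty = Decimal("0.025")     # clock uncertainty row in the report
arrivals = {"Nexys4 DDR": Decimal("11.534"), "Basys3": Decimal("11.381")}

for board, arrival in arrivals.items():
    before = setup_slack(arrival, capture_edge, uncertainty, Decimal("0.000"))
    after = setup_slack(arrival, capture_edge, uncertainty, Decimal("-2.000"))
    print(f"{board}: slack {before} ns -> {after} ns")
```

A negative output delay relaxes the requirement rather than speeding up the design: the LEDs still change at the same time, but the required time moves from 9.975 ns to 11.975 ns, which turns both slacks positive.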



- **2-3-5.** Select **Implemented Design > Edit Timing Constraints** in the *Flow Navigator* pane.
- **2-3-6.** Select the **Set Output Delay** entry in the left pane, and change the Delay Value to -2.000 ns.
- **2-3-7.** Click **Apply**.
- **2-3-8.** Click the **Rerun** link to re-run the timing report.

Observe that the timing violations of the Intra-clock paths are gone.

- **2-3-9.** Expand the **Intra-Clock Paths** folder on the left, expand *clk\_pin*, and select the Setup group to see the list of 10 worst case delays on the right side.
- **2-3-10.** Double-click any path to see how it is made up. Also right-click on it and select **Schematic.**

Click on the **Device** tab and see the highlighted path in the view.

- **2-3-11.** Select **Implemented Design > Report Clock Networks**.
- **2-3-12.** Click **OK**.

The Clock Networks report will be displayed in the Console pane showing two clock net entries.

- **2-3-13.** Select the *clk\_pin* entry and observe the selected nets in the Device view.

The clock nets are spread across multiple clock regions.



Figure 13. Clock nets for the Nexys4 DDR





Figure 13. Clock nets for the Basys3



Figure 13. Clock nets for the Nexys Video

## **Generate the Bitstream**

Step 3

- 3-1. Generate the bitstream.
- **3-1-1.** In the Flow Navigator, under Program and Debug, click **Generate Bitstream.**



Figure 14. Generating the bitstream



**3-1-2.** Because the timing constraints were changed, click **Save** to save the constraints, click **OK**, and then click **Yes** to reset the runs and re-run all the processes.

The write\_bitstream command will be executed (you can verify it by looking in the Tcl console).

**3-1-3.** Click **Cancel** when the bitstream generation is completed.

## **Verify the Functionality**

Step 4

- 4-1. Connect the board and power it ON. Open a hardware session, and program the FPGA.
- **4-1-1.** Make sure that the micro-USB cable is connected to the JTAG PROG connector (next to the power supply connector). Make sure that the jumper on the board is set to select USB power (JP3 for the Nexys4 DDR and JP2 for the Basys3).
- **4-1-2.** Select the *Open Hardware Manager* option and click **OK**.

The Hardware Manager window will open indicating "unconnected" status.

**4-1-3.** Click on the **Open target** link, then **Auto Connect** from the dropdown menu.



Figure 15. Opening new hardware target

- **4-1-4.** The Hardware Session status changes from Unconnected to the server name and the device is highlighted. Also notice that the Status indicates that it is not programmed.
- **4-1-5.** Select the device in the *Hardware Device Properties*, and verify that the **uart\_led.bit** is selected as the programming file in the General tab.
- 4-2. Start a terminal emulator program such as TeraTerm or HyperTerminal. Select an appropriate COM port (you can find the correct COM number using the Control Panel). Set the COM port for 115200 baud rate communication. Program the FPGA and verify the functionality.
- **4-2-1.** Start a terminal emulator program such as TeraTerm or HyperTerminal.
- **4-2-2.** Select an appropriate COM port (you can find the correct COM number using the Control Panel).
- **4-2-3.** Set the COM port for 115200 baud rate communication.
- 4-2-4. Right-click on the FPGA entry in the Hardware window and select **Program Device...**
- **4-2-5.** Click on the **Program** button.



The programming bit file will be downloaded and the DONE light will be turned ON when the FPGA has been programmed.

- **4-2-6.** Type in some characters in the terminal emulator window and see the corresponding ASCII equivalent bit pattern displayed on the LEDs.
- **4-2-7.** Press and hold BTNU and see that the upper four bits are swapped with the lower four bits on the LEDs.
- **4-2-8.** When satisfied, close the terminal emulator program and power OFF the board.
- 4-2-9. Select File > Close Hardware Manager. Click OK.
- **4-2-10.** Close the **Vivado** program by selecting **File > Exit** and click **OK**.
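The LED patterns to expect in steps 4-2-6 and 4-2-7 can be worked out in advance. The sketch below is plain Python and independent of the board: it prints the 8-bit ASCII code of a typed character and the same byte with its upper and lower four bits swapped, which is what holding BTNU should show. Which physical LED carries the most significant bit is not stated here, so the sketch assumes the leftmost printed bit is bit 7; the function names are illustrative.

```python
def led_pattern(char):
    """Return the 8-bit ASCII code of a single character as a bit string."""
    code = ord(char)
    if code > 0xFF:
        raise ValueError("character does not fit in 8 bits")
    return format(code, "08b")


def swap_nibbles(byte):
    """Swap the upper and lower four bits of an 8-bit value."""
    return ((byte << 4) | (byte >> 4)) & 0xFF


for ch in "A5z":
    normal = led_pattern(ch)
    swapped = format(swap_nibbles(ord(ch)), "08b")
    print(f"{ch!r}: LEDs {normal}, with BTNU held {swapped}")
```

For example, typing `A` (0x41) gives 01000001 on the LEDs, and 00010100 (0x14) while BTNU is held.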

#### Conclusion

In this lab, you learned about many of the reports available to designers in the Vivado IDE. You had the opportunity to learn basic design analysis tools including the Schematic viewer, delay path properties and reports viewer, Device viewer, and selecting primitive parents. You also learned about the basic timing report options that are at your disposal. You verified the functionality in hardware by typing characters on the host machine and seeing the LED pattern change.

